Learning Context-Free Grammars with a Simplicity Bias
Abstract
We examine the role of simplicity in directing the induction of context-free grammars from sample sentences. We present a rational reconstruction of Wolff's SNPR (the Grids system) which incorporates a bias toward grammars that minimize description length. The algorithm alternates between merging existing nonterminal symbols and creating new symbols, using a beam search to move from complex to simpler grammars. Experiments suggest that this approach can induce accurate grammars and that it scales reasonably to more difficult domains.
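The alternation between creating and merging symbols under a beam search can be illustrated with a small Python sketch. This is not the Grids implementation: the grammar representation, the function names (`description_length`, `merge`, `create`, `induce`), the `N0`, `N1`, ... naming of new symbols, the stopping rule, and the scoring are all assumptions made for illustration. In particular, description length is approximated here as the number of symbols in the grammar alone, whereas a fuller measure would typically also charge for encoding the sample sentences given the grammar.

```python
from itertools import combinations, count

# Assumed representation: a grammar is a frozenset of rules (lhs, rhs),
# where rhs is a tuple of symbols and "S" is the start symbol.

def description_length(grammar):
    """Crude simplicity score (an assumption): symbols in all rules."""
    return sum(1 + len(rhs) for _, rhs in grammar)

def nonterminals(grammar):
    return sorted({lhs for lhs, _ in grammar})

def merge(grammar, keep, drop):
    """Merge nonterminal `drop` into `keep` by renaming it everywhere."""
    def rename(sym):
        return keep if sym == drop else sym
    return frozenset((rename(lhs), tuple(rename(s) for s in rhs))
                     for lhs, rhs in grammar)

def create(grammar, pair, new_symbol):
    """Add `new_symbol -> pair` and replace the pair inside rule bodies."""
    def rewrite(rhs):
        out, i = [], 0
        while i < len(rhs):
            if rhs[i:i + 2] == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(rhs[i])
                i += 1
        return tuple(out)
    rules = {(lhs, rewrite(rhs)) for lhs, rhs in grammar}
    rules.add((new_symbol, pair))
    return frozenset(rules)

def candidates(grammar, step):
    """Yield every grammar reachable by one merge or one create step."""
    nts = nonterminals(grammar)
    if step == "merge":
        # The start symbol is never merged away.
        for keep, drop in combinations([n for n in nts if n != "S"], 2):
            yield merge(grammar, keep, drop)
    else:
        fresh = next(f"N{i}" for i in count() if f"N{i}" not in nts)
        pairs = {rhs[i:i + 2] for _, rhs in grammar
                 for i in range(len(rhs) - 1)}
        for pair in sorted(pairs):
            yield create(grammar, pair, fresh)

def induce(grammar, beam_width=3, max_rounds=20):
    """Alternate create and merge rounds, keeping the simplest grammars."""
    beam, best, stale = [grammar], grammar, 0
    for round_no in range(max_rounds):
        step = "create" if round_no % 2 == 0 else "merge"
        pool = set(beam)
        for g in beam:
            pool.update(candidates(g, step))
        # Ties are broken by the sorted rule list to keep runs deterministic.
        beam = sorted(pool,
                      key=lambda g: (description_length(g), sorted(g)))[:beam_width]
        if description_length(beam[0]) < description_length(best):
            best, stale = beam[0], 0
        else:
            stale += 1
            if stale >= 2:  # one full create+merge cycle without progress
                break
    return best

# Start from a flat grammar with one rule per sample sentence.
sentences = ["the big dog barks", "the big dog sleeps",
             "the big dog eats", "the big dog runs"]
initial = frozenset(("S", tuple(s.split())) for s in sentences)
result = induce(initial)
print(description_length(initial), description_length(result))
for lhs, rhs in sorted(result):
    print(lhs, "->", " ".join(rhs))
```

Because this toy score ignores the cost of the data, it rewards merges that overgeneralize whenever they collapse duplicate rules. A score that also counts the cost of deriving the sample sentences would penalize such merges.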
Related resources
Simplicity and Representation Change in Grammar Induction
In this paper we examine the role of a bias toward simplicity in directing the process of representation change. We focus on the task of inducing context-free grammars from sample sentences, and we present a rational reconstruction of Wolff's SNPR (the Grids system) that incorporates the simplicity bias. The basic induction method alternates between merging existing nonterminal symbols and cre...
Learning context-free grammars to extract relations from text
In this paper we propose a novel relation extraction method, based on grammatical inference. Following a semisupervised learning approach, the text that connects named entities in an annotated corpus is used to infer a context free grammar. The grammar learning algorithm is able to infer grammars from positive examples only, controlling overgeneralisation through minimum description length. Eva...
Solving Trigonometric Identities with Tree Adjunct Grammar Guided Genetic Programming
Genetic programming (GP) may be seen as a machine learning method, which induces a population of computer programs by evolutionary means (Banzhaf et al. 1998). Genetic programming has been used successfully in generating computer programs for solving a number of problems in a wide range of areas. In (Hoai and McKay 2001), we proposed a framework for a grammar-guided genetic programming system c...
Learning restricted probabilistic link grammars
We describe a language model employing a new headed-disjuncts formulation of Lafferty et al.'s (1992) probabilistic link grammar, together with (1) an EM training method for estimating the probabilities, and (2) a procedure for learning some simple lexicalized grammar structures. The model in its simplest form is a generalization of n-gram models, but in its general form possesses context-free exp...
Inducing Tree-Substitution Grammars
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has...
Publication date: 2000